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Abstract 

When comparing 2D shapes, a key issue is their normalization. 
Translation and scale are easily taken care of by removing the mean 
and normalizing the energy. However, defining and computing the ori- 
entation of a 2D shape is not so simple. In fact, although for elongated 
shapes the principal axis can be used to define one of two possible ori- 
entations, there is no such tool for general shapes. As we show in the 
paper, previous approaches fail to compute the orientation of even 
noiseless observations of simple shapes. We address this problem. In 
the paper, we show how to uniquely define the orientation of an arbi- 
trary 2D shape, in terms of what we call its Principal Moments. We 
show that a small subset of these moments suffice to represent the un- 
derlying 2D shape and propose a new method to efficiently compute 
the shape orientation: Principal Moment Analysis. Finally, we discuss 
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how this method can further be applied to normalize grey-level images. 
Besides the theoretical proof of correctness, we describe experiments 
demonstrating robustness to noise and illustrating the method with 
real images. 



1 Introduction 

Representing shape is a challenging task. In fact, unlike local characteristics 
like color, which can be uniquely determined by a small set of parameters, or 
texture, which has been successfully captured by using statistical descriptors, 
the visual information conveyed by a shape, of more global nature and easily 
perceived by humans, is hard to represent in an appropriate way. This paper 
deals with two-dimensional (2D) shape representation. 

When the 2D shapes to describe are simply connected regions, researchers 



have used contour-based descriptions (e.g., Chauang and Kuo, 1996; Bar 



tolini et al, 2005). Naturally, for more general shapes, usually consisting in 
arbitrary sets of points, or landmarks, these descriptors are not adequate. 
If the points describing the shape are labeled, i.e., if the correspondences 
between the landmarks of two shapes to compare are known, the problem 
reduces to the impact of geometrical transformations and disturbances, el- 



egantly addressed through the statistical theory of shape of Kendall et al 



(1999). However, in many practical scenarios, the shape points are obtained 



from an automatic process, e.g., edge or corner detection, thus come without 
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labels or natural ordering. 

Estimating the correspondences between points of two shapes leads to 
a combinatorial problem, which requires prohibitively time-consuming algo- 
rithms, even for shapes described by a moderate number of landmarks. To 
circumvent this problem, researchers have recently attempted to come up 
with permutation-invariant representations for sets of points. For example, 



Jebara (2003) shows how to factor out unknown labels through the solu- 



tion of a convex optimization problem over the set of permutation matrices, 



and Rodrigues et al (2008a) proposes a permutation-invariant representation 
obtained by densely sampling an analytic function. 

Naturally, the study of shape representation if often motivated by the 
challenge of comparing two arbitrary shapes. Indeed, if an efficient and uni- 
versal way of representing shapes is found, the comparison of two shapes can 
be brought off through the direct comparison of its representations. Regard- 
ing universality, an issue that arises when representing and comparing two 
shapes is their normalization with respect to geometrical transformations, 
such as translation, scale and rotation. In fact, a desirable requirement of 
shape representations is that they should form a complete set of invariants 
with respect to these transformations, i.e., two shapes should be equal up to 
a combination of these transformations if and only if their representations 
are equal. Translation and scale are easily taken care of by removing the 
mean and normalizing the energy of the shape. Rotation is correspondingly 
factored out by normalizing according to an orientation of the shape, an an- 
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gle, intrinsic to the shape, that varies coherently when the shape is rotated. 
However, as we discuss in the following paragraphs, defining and computing 
the orientation of an arbitrary 2D shape is not so simple. 

Additionally, as motivated earlier, complete permutation invariance is a 
key requirement of unlabled shape representations as well. Nevertheless, the 
comparison of two sets of points that are related by an unknown transforma- 
tion that includes, simultaneously, a 2D rotation, due to different orientation, 
and a permutation, due to the absence of labels for the points, results highly 
complex, due to the fact that estimating the transformation leads to a non- 
convex problem. Iterative methods have been used to compute, in alternate 
steps, rotation and permutation: the Iterative Closest Point (ICP) algorithm 



of Besl and McKay (1992), or its probabilistic versions based on Expectation- 



Maximization (EM) (e.g. j McNeill and Vijayakumar, 2006). However, these 
approaches suffer from the usual sensitivity to the initialization, exhibiting 
uncertain convergence. Other proposed approaches exhibit drawbacks as 



well: the convex optimization approach of Jebara (2003) does not deal with 



rotation and the analytic representation of Rodrigues et al (2008a) is not 
rotation invariant, requiring pairwise alignment. 

The most straightforward method to define orientation uses the principal 
axis of the shape, obtained, e.g., through Principal Component Analysis 
(PC A). Although unable to provide a unique orientation, this method defines 
two possible orientations for elongated shapes, see Fig. [j] (the directional 
ambiguity happens in general, not only for mirror-symmetric shapes such as 



4 



the one used for illustration). For shapes that do not have a well defined 
principal axis, e.g., rotationally symmetric shapes, PCA-based orientation is 
completely ambiguous, see Fig. [2j 




Figure 2: A rotationally symmetric shape. Since for this kind of shapes 
the principal axis is not defined, PCA can not be used to compute their 
orientation. 

To deal with these ambiguities, researchers attempted to work with con- 



cepts like mirror-symmetry axes (Atallah, 1985; Marola, 1989), universal 



principal axis (Lin, 1993), and generalized principal axis (Tsai and Chou 



1991; Zunic et al, 2006). In general, the motivation for these works is more 



on the definition of a "reasonable geometric orientation" than on the ro- 
bust computation of a unique orientation angle for arbitrary shapes. Since 
rotationally symmetric shapes are particularly challenging, the automatic de- 
tection of symmetry and fold number, by itself a relevant problem, has also 



received attention (Lin et al, 1994; Shen and Ip, 1999; Derrode and Ghorbel 



2004; Prasad and Yegnanarayana, 2004). 



The more theoretically sustained methods to compute orientation are 
based on the geometric moments of the points defining the shape. In partic- 
ular, the so-called Complex Moments (CMs) were introduced in the eighties 



(Abu-Mostafa and Psaltis, 1985; Teh and Chin, 1988). The elegance of these 
approaches comes from defining the orientation through the phase of a sin- 
gle CM of a particular order. In the nineties, more general moments were 



proposed to deal with degenerate shapes (Shen and Ip, 1997, 1999), at a 
cost of dealing with several moments, chosen by tuning a free parameter in- 
dex through search, and detecting rotational symmetry as an intermediate 
step. However, as we detail in Section [5| these methods do not cope with 
several shapes that lead to singular moments. In practice, this means that 
the phase of these moments is sensitive to noise, leading to unstable esti- 
mates of orientation. Other approaches require the exhaustive search for the 



angle maximizing a given orientation measure (Ha and Moura, 2003, 2005), 
without any guarantee of uniqueness of the solution. 
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In this paper, we address the need to combine compact descriptions of 2D 
shapes, for computational efficiency, with the invariance and discriminative 
power of complete sets of invariants. We propose to represent 2D shapes 
in terms of particular complex moments, which we call the Principal Mo- 
ments (PMs). Although moments of image patterns have been extensively 



used due to their invariance properties, since at least the early sixties (Hu 



1962), their discriminative properties have not been studied even in more 



recent related work (e.g., Abu-Mostafa and Psaltis, 1985; Khotanzad and 



Hong, 


1990; 


Shen and Ip, 


1999; 



representation uniquely defines the shape. In fact, using the same number 
of PMs as the number of shape landmarks, our representation forms a com- 
plete set of invariants with respect to permutation. We further show that the 
PMs coincide with the coefficients of the Fourier series of the representation 



of Rodrigues et al (2008a), a result that guarantees that our representation 
inherits the discriminative power demonstrated by the experiments reported 
in that paper. Subsequently, we derive an upper bound for the magnitude 
of these coefficients in terms of the shape complexity, i.e., of the number of 
landmarks. Using these results, we show that our representation is compact, 
in the sense that a small number of PMs (much smaller than the number 
of landmarks) suffices to represent the shape. This compactness contrasts 
with the usual large dimension of other complete representations (e.g., those 



based on the bispectrum (Kondor, 2008) or the densely sampled functions of 



Rodrigues et al (2008a)), an issue of outmost relevance when working with 
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large databases. 

In what respects to geometric transformations, besides trivially extending 
the representation to deal with translation and scale transformations, a ma- 
jor contribution of this paper is the extension to also include rotations. This 
extension consists in previously rotating each shape instance of a normaliza- 
tion angle which defines the shape orientation. Overcoming the limitations 
discussed above, we present a new method to define and compute a unique 
orientation of any 2D shape, based on its PMs. More specifically, we show 
that the phases of two of these moments unambiguously define the orienta- 
tion of an arbitrary 2D shape (including rotationally symmetric ones) and 
propose an algorithm, Principal Moment Analysis (PMA), that computes the 
orientation angle by integrating the contributions of all pairs of moments. 

Naturally, PMA can be used to normalize arbitrary 2D shapes, e.g., bi- 
nary images, with respect to orientation, before any other processing takes 
place. In the paper we also discuss the straightforward extension of PMA to 
the normalization of general gray-level images. 

Besides theoretically sound, PMA results are robust to noise, as the ex- 
periments in the paper illustrate. 

The remaining of the paper is organized as follows. Section [2] introduces 
the PMs and relates them to previously proposed moments. In Section [3j we 



relate the PMs with the representation of Rodrigues et al (2008a) and derive 
an upper bound for the length containing most of its energy, which enables 
ending up with a compact shape representation. Section [4] describes our 
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approach to extend the representation towards obtaining a complete set of 
invariants with respect to 2D rotation. In Section [5| we detail the limitations 
of current methods when estimating 2D orientation. Section [6] presents PMA, 
our algorithm for computing a unique orientation of an arbitrary 2D shape 
from its PMs. In Section [7j we extend PMA to the orientation normalization 
of gray-level images. Section [8] contains experiments and Section [9] concludes 
the paper. 

2 Principal Moments for 2D Shape Repre- 
sentation 

Consider an arbitrary 2D shape described by a set of N points in the plain, 
thus by an A-dimensional complex vector z G C^, containing their coordi- 
nates: 



xi + jyi 




Z\ 


X2 + jV2 




Z2 


%N + JVn 




Zn 



(1) 



Naturally, since the points do not have labels, the same shape can be de- 
scribed by any vector obtained from z by re-ordering its entries. 
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We define the /c th -order Principal Moment (PM), fee {1,2, 3,...}, by 



n=l 



The k th PM, after stripping off the scaling factor l/(iVfc!), is also known as 
the k th power sum of the landmarks {24, z 2) . . . , Zn}. The inclusion of this 
particular scaling factor in the definition ^ is motivated in the next section. 
Through the Newton's identities, the first TV power sums can be unambigu- 
ously converted into the so-called fundamental symmetric polynomials of the 
landmarks. As these polynomials can be seen as the coefficients (up to sign 
changes) of the 7V th -order univariate polynomial containing the landmarks 
as roots, there is a bijection between this univariate polynomial and the first 



N power sums. The reader is referred to the enlightening book of Kanatani 



(1990) for what respects to these equivalences. Since an 7V th -order univariate 
polynomial unequivocally defines its TV roots, although orderless, the first TV 
power sums unambiguously define the shape z, up to a permutation, and 
so do the first TV PMs. This TV-sized representation is thus said to form a 
complete set of invariants over the permutation group, or, equivalently, to 
be maximally invariant to permutations. In other words, (i) any vector ob- 
tained from z by re-ordering its entries leads to the same PMs (permutation 
invariance) ; and (ii) any vector with at least one landmark lying at a different 
position than the ones in z, leads to distinct PMs (discrimination). 

The defined representation can be trivially extended to be maximally 
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invariant with respect to simple geometric transformations. Indeed, it is 
easily shown that the simple pre-processing step of working with 



N, 



(3) 



where z = J^ n=1 z n denotes the center of mass of z and || • || denotes the Z 2 
norm, rather than directly with z, extends the complete set of invariants to 
include translation and scale transformations. The extension with respect to 
rotation operations, though, is not so trivial and it is, along with the subject 
of shape representation using PMs, the central subject of this paper. 

Before proceeding, we relate the PMs in ^ with the more general Com- 



plex Moments (CMs) of Abu-Mostafa and Psaltis (1985) and Generalized 



Complex (GC) moments of Shen and Ip (1997, 1999). The CM of order 
(p, q) of an image g{x ) y) is defined by 



r r+oo 

c pq(9) = / (x + jy) p (x-jy) q g(x 1 y)dxdy, 

J J —oo 



(4) 



where p > and < q < p (Abu-Mostafa and Psaltis, 1985). Considering an 



image composed by a set of mass points located at the shape landmarks 
{zi, z 2 , • • • 3 %n}, the integral in Q becomes a sum: 



N 



C7 M (z) = ]>>£«)4, 



(5) 



71=1 



where denotes the complex conjugate of z n . In turn, the GC moment of 
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order (p, q) is given by the integral in polar coordinates 



gcm 



r p e^ q0 g{r cos 0, r sin 9) r dr d9 



(6) 



7T ^0 



where p e {0, 1,2, . . .} and q e {1, 2, 3, . . .} flShen and Tp| |1997| ). For a shape 
z, the GC moments collapse into the sums 



N 



GC pq (z) = ^2\ Zn \ p e 



jgarg z n 



(7) 



n=l 



It is now clear that the PMs are CMs and GC moments of particular 
orders (up to a scaling factor): from ( 2|5|7 ), Mfc(z) = Cko(z)/(Nk\) = 
GCkk(z)/(Nk\). Finally, note also that, for any shape, M (z) = J^Li 1 = 
1, and, assuming the shapes were pre-processed as in ([3]), Mi(z) = 



3 A Compact Shape Representation 

We now show why a representation that uses a small subset of PMs suffices 
in practice. To do this, we start by relating the PMs with the analytic 



signature (ANSIG), the representation introduced by Rodrigues et al (2008a). 
The ANSIG is an analytic function on the complex plane, obtained from z 
through 

1 N 



n=l 
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Rodrigues et al (2008a) show that this representation is complete and 
permutation-invariant and thoughtfully illustrate its capabilities with sev- 
eral shape-based image classification experiments. In these experiments, the 
analytic function (|8j) is described in the computer by its 512 samples uni- 
formly taken on the unit-circle of the complex plane. Working with these 
high dimensional vectors may be adequate for tasks requiring the compari- 
son of a small number of shapes but certainly not for applications that deal 
with very large databases, e.g., the internet. 

In the sequel, we derive that our PMs ^ are intimately related with 
the ANSIG ^ and show that a small number of either PMs or of ANSIG 
samples suffices to represent the shape with similar performance. 

3.1 The Principal Moments and the ANSIG Spectrum 



A direct consequence of Cauchy's integral formula (see, e.g., Ahlfors, 1978) 
is that any analytic function is fully specified by the values it takes on a 
closed contour on the complex plane. Thus, the ANSIG in ^ is equivalently 
described by its restriction to the unit-circle, 

1 N 

h(z, 9) = a(z, e> 9 ) = - ^ exp (z n e^ 9 ) . (9) 

71=1 

To approximate the unit-circle restriction of the ANSIG by a finite- 
dimensional computer representation, we study its frequency spectrum. Note 
that h(z,8) in ^ can be seen as a real- argument complex- valued peri- 
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odic function, with fundamental period T = 2n and fundamental frequency 
ljq — 2tv/T — 1. Thus, it can be written in terms of its Fourier series, 



h(z,0)= ^( z ) e 



(10) 



k= — oo 



where each coefficient is given by (see, e.g., Oppenheim et al, 1996): 



tf fc (z) = ^- / h(z,6)e-i kd de. 



:n) 



The analysis expression (11) is hard to carry out in the case of h(z,9) 
given by However, the coefficients of the Fourier series easily follow from 



the comparison of the synthesis expression (10) with the definition of /i(z, 6) 



m @, after some manipulations. In fact, expressing the exponential in ^ 
by its Maclaurin series (i.e., its Taylor series at the origin), we get 



. N oc 



N ^— ' ^— ' k\ 

n=l k=0 
oo / ^ N 

= £ (TO £ Z " 



Jk6 



fc=0 \ n=l 



(12) 



Comparing (12) with (10), and since the coefficients of the Fourier series 



representation are unique (Oppenheim et al, 1996), we conclude that 



H k (z) 



Jk\ S 



Nk\ 



N y k 
n=l 



for < k < +oc 



(13) 



for 



oc < k < -1 
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Thus, given a 2D shape described by a set of TV landmarks, expression 



(13) relates the values of the coefficients of the Fourier series of the unit- 
circle restriction of its ANSIG, {H^, — oc < k < +oc}, to the positions of the 
landmarks in the plane, {24, z 2 , . . . , Zjy}. Comparing with ([2]), we conclude 
that these coefficients for k > coincide with the PMs: 

H k (z) = M k (z), k = 0,1,2,... . 

Due to the one-to-one correspondence just proved between M^(z), h(z,9) 
and a(z,£), the PM shape representation inherits all the properties of the 
ANSIG. Although the fact that the PMs constitute a complete representation 
was already referred in the previous section (and the fact that they enjoy per- 
mutation invariance is immediate from definition ([2])), we now know they are 
equivalent to the ANSIG also in what respects to discrimination capabilities. 

3.2 Compactness of the Principal Moments 

We now show that the proposed representation is compact, in the sense that 
only a small number of PMs is enough to represent the shape with neglectable 
loss of discrimination power. For that purpose, we derive an upper bound 
b{k) for the magnitude of the coefficient H k (or, equivalently, the PM 
for k > 2, in terms of the number of landmarks describing the shape. Using 
this result, we deliver an upper limit for the (wide-sense) bandwidth of the 
unit-circle restriction h(z,9) of the ANSIG. The first step is done through 
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the following chain of equalities and inequalities: 



\M k 



1 



N 



n=l 
N 



< 



1 N 



Nk\ 



71=1 



1 N k I ( N 

= ]vjfc[ S(W 2 ) 2 - 7w ( 



ra=l 



v n=l 



ATI" 1 

fc! 



d ^ f b(k) 



(14) 

(15) 
(16) 



where (14) uses the triangle inequality, (15) uses the fact that the function 



(a + b) n is convex for a, 6 real positive and n > 1, and (16) is due to the fact 



that X^=i l z ™| 2 = f° r shape vectors z normalized according to 

To estimate the bandwidth of h(z : 6) in terms of the number of landmarks, 
usually, we would seek the smallest k such that the ratio |Mfc|/|M | is below 
a given threshold p. In our case, since M = 1, this would lead to |M&| < p. 
Nevertheless, as we do not know the exact value of |M&|, we will use the 
upper bound b(k) as a proxy. We are not sure to obtain the smallest k that 
satisfies our bandwidth constraint, but we guarantee the satisfaction of the 
inequality, since |M&| < b(k). 

Finding the smallest k such that b(k) < p requires solving the limit case 

k 

equation N% /k\ = p : for which there is no analytic solution. We propose 
a simple method to solve for k numerically. Due to the fast grow of k\ and 
to increase stability, we apply logarithms on both sides of the inequality. 
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Denoting the natural logarithm of b(k) by B{k) ) we have: 



B(k) 



~ k 



\nb(k) = 
'In TV 



- - 1 ) \nN-\nk\ 



M - k Ink 



(17) 



--Infc-IniV--ln27r. 
2 2 



(18) 



where (17) uses the definition of b(k) in (16) and (18) uses the Stirling's 



approximation \nk\ ~ kink — k + Mn27r/c (see, e.g., Paris and Kaminsky 



2001). 



To analyze the behavior of B(k) given by (18), we relax k to the reals 
and express the first two derivatives: 



B <(*) = — -ln*-- 



From these expressions, we see that B(k) has an inflection at k = 1/2, where 
B"(k) = 0, and two extrema at = fci < 1/2 and = fc 2 > 1/2, where 
B'(k) = (assuming > 2). Furthermore, B(k) monotonically decreases 
for k > fc 2 , where B'{k) < 0, being lim/e^+oo = — oc. The plot in Fig. [3] 
illustrates the behavior of B{k) for = 10. To find k such that B(k) is 
below a given threshold lnp, we thus propose the following strategy in two 
steps: first, solve B\k) = in the interval k G [1/2, +oc), obtaining fc 2 . 
Then, solve B(k) — ln(p) = in the interval G [fc 2 , +oc), obtaining k = ks, 
the desired upper bound for the bandwidth of the ANSIG. Note that the first 
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step is necessary to specify the lower limit of the search region for the second 
one: without that limit, we could obtain a spurious solution ks < k 2 . 




k o,5 1 2 k 3 4 5 6 7 

1 2 k 



Figure 3: Upper bound B{k) for the (logarithm of the) magnitude of the 
spectrum of the analytic signature (or, equivalent ly, magnitude of the prin- 
cipal moments), for TV = 10 points. 

We thus conclude that most of the energy (the parameter p controls the 
amount) of the ANSIG of a shape described by TV landmarks is contained 
in a number ks of complex coefficients, which, naturally, depends on N. 
Fig.[4]plots the number of coefficients computed as described above, as a 
function of the number of landmarks AT, for p — 0.1 (— 20dB). Obviously, ks 
can be indistinctly interpreted as either the required number of Fourier series 
coefficients, i.e., the number of PMs, or the required number of samples in the 
unit-circle to represent the shape. In fact, since the fundamental frequency 
is u)q = 1, the approximate bandwidth of the signal is ujb = &b^o = 
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Since the spectrum of h(z,9) is zero for negative frequencies, c.f. (13), it 
suffices to sample at a rate (number of points) of TV = uj s = ujb = 
(the Nyquist sampling rate u) s of twice the bandwidth is only required for 
two-sided spectra (Oppenheim et al, 1996, 1999| )). Naturally, to recover the 
original continuous ANSIG from these samples, we should use a (complex 
coefficient) filter with passband u G [0, u; 5 ) (in opposition to the traditional 
low-pass filter with cutting frequency uj s /2). 

The plot in Fig. [| also compares the required number ks of samples, 



or of PMs, with 512, the fixed number of samples used in Rodrigues et al 



( |2008a ): while for shapes described by an huge number of points (more than 
~ 40000), 512 samples may not be enough, for the majority of cases that may 
arise in practice (a few hundreds of landmarks), the required number is much 
smaller (a few dozens). Note further that ks is smaller than TV, making the 
representation based on PMs loose its maximal invariance to permutations, 
in a strict sense, when using ks coefficients. Nevertheless, the discrimination 
loss that results from this is small, since most of the energy of the signature 
is captured by the first ks coefficients 

Having motivated the usage of only ks PMs as a way to compactly rep- 
resent shapes in terms of its complexity, i.e., its number of landmarks, we 
now discuss the comparison of descriptions of shapes of distinct complexity. 
Since the first ks PMs are (the most relevant) coefficients of the ANSIG 



1 Naturally, the compactness of the representation is due to the decay of \M^\ with k\ 
imposed by the normalization factor in (J2|, which is now motivated. In Appendix [Aj we 
further discuss the issue of normalizing power sums. 
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Figure 4: The number of coefficients ks needed to represent a shape described 
by TV landmarks. 

Fourier series, it suffices to pad with zeros the smaller vector before per- 
forming the comparison in frequency domain. If, in opposition, the shapes 
are equivalently described by the sparse set of ks ANSIG samples, say Ni 
samples for one of the shapes and N 2 samples for the other, it is necessary 
to use multirate signal processing techniques to convert both to a common 



sampling rate (see, e.g., Oppenheim et al, 1999). For example, perform up- 



sampling of the ANSIGs by a factor of, respectively, L\ = lcm(A^i, iV^/iVi 
and L 2 = lcm (TVi, A^/TV^, where 1cm stands for the least common multiple, 
followed by interpolation using (complex coefficient) filters with passband, 
respectively, uu G [0, 27r/Li) and uu G [0,27r/L 2 ). 
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4 Maximal Rotation Invariance 



The representation of a 2D shape by its PMs forms a complete set of in- 
variants with respect to permutation. The extension of the set to include 
translation and scale invariance is easily obtained by removing the mean and 
normalizing the energy, according to However, taking care of the orien- 
tation is not obvious. In the following, we propose an extension to include 
maximal invariance to rotations by computing a unique orientation, definable 
for an arbitrary shape. 

A natural and the most common way to attempt to obtain rotation in- 
variance consists in finding an angle 0(z) such that, through rotation, any 
shape z is brought to its "normalized" version 

w(z) = ze- J${z) . (19) 



In fact, if the desired invariance is satisfied, i.e., if. 



, w (ze^) = w(z) , 



(20) 



the normalization in (19) produces a maximal invariant. However, current 



methods to compute the shape orientation 9(z) either fail to process particu- 
lar shapes (see examples in Figs, [j] and [2] and others in the following section) 
or do not guarantee the equality in (|20|). 



The success of the normalization in (19) hinges then on finding an ap 
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propriate function 9 : — )► (— 7r,7r] that unambiguously defines 0(z), the 
orientation of the shape z. We now show that, to guarantee the desired 
invariance, it suffices that this function satisfies a natural condition: that 
the orientation of a rotated shape is the sum of the orientation of the origi- 
nal shape with the rotation angle. Formally, this condition means that the 
function #(•) has to satisfy 



V , d(ze>+)=6{z)+<l> : 



(21) 



where the equality is modulo 2n. Simple manipulations show that (21) suf- 



fices to guarantee (20): 



w (ze^) = ze^ exp (-j 0(ze^)) 



= ze 



w z 



(22) 
(23) 

(24) 



where (22) and (24) use the definition (19) and (23) uses rt21 



In Section |6j we propose an orientation function #(•) satisfying (21). 
This is a relevant contribution to maximally invariant representations for 2D 



shapes, since the pre-processing step (19) extends any permutation- invariant 



representation, like the ones proposed by Jebara (2003) and Rodrigues et al 



(2008a), or the PMs introduced in Section 2 of this article, to also accommo 
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date maximal invariance with respect to arbitrary geometric transformations, 
i.e., including rotation. 



5 Limitations of Previous Approaches to 
Moment-based Orientation Normalization 

To obtain the orientation 0(z) of a shape z, we use the PMs of the points 
describing the shape. Since image moments have been used in the past, 
we first overview moment-based estimation of orientation and motivate the 
need to revisit the problem. The usage of Complex Moments (CMs) to 
define orientation was proposed in Abu-Mostafa and Psaltis ( 1985| ). CMs 



stand for compact representations of linear combinations of ordinary (i.e., 
real) geometric moments. In that work, the authors define and compute the 
orientation by imposing the phase of one of the moments C q +i^ q in ^ to be 
zero. When applying this method to a shape z, i.e., to an image composed 
by a set of TV mass points describing the shape, we obtain, through ([5]), the 
moments 

N 

C q+1 , q (z) = J2\zn\ 2q+1 e j ^, (25) 



n=l 



where, as introduced in Section [2| z n collects the coordinates of the n shape 
point. 

Although the method just described is adequate to deal with shapes z 
that lead to a moment C q +i iq (z) with large magnitude, there are shapes for 
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which this does not happen for any q. It was known that this is the case 



of rotationally symmetric shapes (Abu-Mostafa and Psaltis, 1985), but we 



now show it may also happen with general ones. Just look at the example 
in Fig. pi where z = [1, — exp(j/27r/3), exp(— j2n/3)] T . For this shape, 



from (25), we obtain that all the moments C g+ i ?g (z) are zero, regardless of q: 



C q+1 , q (z) = ei° + e^ 2 + e~^ 2 + e^' z + e~ j2 ^ 3 = 



For 5-fold rotationally symmetric shapes, Abu-Mostafa and Psaltis (1985) 



propose to use the phase of one of the moments C q+ s,q- However, again, there 
are 5-fold rotationally symmetric shapes for which these are all zero. For 
example, it is straightforward to show that the 2-fold rotationally symmetric 
shape in Fig. pleads to C g+ 2,g(z) = 0, W q . Although the examples in Figs. [5] 
and [6] (or others the reader may come up with) serve as mere illustrations 
of extreme cases, they also make clear that in practice it is not adequate to 
rely on the angle of these moments to robustly compute shape orientation, 
since when the magnitude of those moments is small, their phase results very 
sensitive to the noise. 

Moment-based orientation was later addressed by using Generalized Com- 



plex (GC) moments (Shen and Ip, 1997, 1999). GC moments, simply termed 



rotational moments in a previous review (including Legendre, Zernike, and 
CMs) by Teh and Chin (1988) are given by ^ and can be seen as the co- 
efficients of the Fourier series of radial projections of the image. To deal 
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Figure 5: Example of a shape (composed by the points 
{1, ±j, exp(±j27r/3)}), for which all the complex moments C q+ i^ q are 
zero, making impossible to define its orientation in terms of the phase of 
such moments. 

with ambiguities that arise when attempting to define and compute shape 



orientation from a single moment of a particular order, Shen and Ip (1997 



1999) use three non-zero GC moments with a fixed index p. The method is 
not simple: from GC pqi and GC pq2) the possibility is inferred that the shape 
is rotationally symmetric; in case there is that possibility, the unambiguous 
detection of symmetry requires an exhaustive search; if the shape is classified 
as rotationally symmetric, a third moment GC pq3 is also used to compute the 
orientation. The simple example in Fig{7] shows that this method may fail: 
consider Zi = [1, — 1/4, — 3/4] T and z 2 its reflection, i.e., Zi rotated by 7r, 
z 2 = — zi, and the choice of GC index p = 1. Using 0, we get 



GC lq ( Zl ) = 1 + l -e^ + = 1 + (-1)« = GC lq (z 2 ) , 
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Figure 6: Example of a 2-fold rotationally symmetric shape (composed by 
the points {±1, ± exp(±jV/3), ± exp(±jV/4)}), for which all the complex 
moments C q +2,q are zero, making impossible to define its orientation in terms 
of the phase of such moments. 

showing that it is impossible to distinguish between the orientations of Zi 
and z 2 from the moments GC\ q (note that the shape in Zi and z 2 is not 
rotationally symmetric, thus different orientation angles #(zi) and #(z 2 ) must 



be computed). Although Shen and Ip (1997, 1999) propose to tune the index 



p by maximizing a so-called alternating energy of the radial projection (which 
also requires exhaustive search), this method fails to exclude p — 1 for the 
example above. 

6 The Unique Orientation of a 2D Shape 
Through Principal Moment Analysis 



We now present our algorithm to compute the unique orientation of an arbi- 
trary shape, i.e., we derive a function 9(z) that satisfies property (21). Our 
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Figure 7: Example of a shape (composed by the points {1,-1/4,-3/4}), 
for which all the generalized complex moments GC\ q are the same as for its 
reflexion ({ — 1, 1/4, 3/4}), illustrating that it is impossible to unambiguously 
define their orientations in terms of those moments. 

approach to define and compute the orientation 0(z) is based on the PMs 
of the shape, given by ([5]). As we will only deal with their argument, for 
simplicity, we strip off the scaling factor and work directly with the power 
sums 

N 

fji k (z) =z* + z* + ... + z k N = ^z* : (26) 

71=1 

where k G {1, 2,3,.. .}. Note, nevertheless, that both the power sums {/^(z)} 
(J26j) and the PMs {M k (z)} @ lead to the same algorithm to be explained, 



and that both forms can be used to compute the (same) unique orientation of 
the shape. For this reason, in this section, we refer to /Xfc(z) as the fc th -order 
PM of the shape. 
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6.1 A Single Moment is Not Sufficient 

If we were to choose 0(z) = arg/xi(z) (equivalent to applying the simplest 
form of the method in |Abu-Mostafa and Psaltis| fll985| ), using do = Mi), 



we would satisfy ( [21] ). In fact, from (26), //i(ze J ^) = [ii(z)e^, thus 
arg/ii(ze J< ^) = arg/xi(z) + 0. Nevertheless, since /ii(z) = ^ n z n is pro- 
portional to the shape center, its angle is not a characteristic of the shape 
format, but only of the shape localization. In practice, to obtain transla- 
tion invariance, it is common to apply the pre-processing step ([3]), where the 
shapes are centered by subtracting their center of mass, thus \i\ is zero for 
all shapes and useless to determine a shape orientation. 

The choice 0(z) = arg/ii(z) is equivalent to imposing the argument of 



the first-order PM of the rotationally normalized shape (19) to be zero, i.e., 
imposing arg/xi(ze -J <9 ( z )) = 0. Our approach is to generalize this method by 
doing the same to the fc th -order PM, assumed to be non-zero: 

arg/^ze"^) =0. (27) 

Note that one non-zero PM must exist, as otherwise all landmarks of the 



shape are at the origin. To solve for 0(z), use the definition (26) to rewrite 



(j27j) as 

N 

arg J^e-^ (z) =0. 

71=1 
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Since complex arguments are defined modulo 27r, we get 

N 

axg5^-fc0(z)+27rZ = O, (28) 



n=l 



where I is an integer. Now, we express the solution(s) for the normalization 
angle as 

arg/i fc (z) 2tt 

0(z) = + — / , / G {0, 1, . . . , k - 1} , (29) 

where we used (l26j) again and noted that only k values of I lead to distinct 



solutions for 0(z). Expression (29) makes clear that an ambiguity arises when 
attempting to define the normalization angle using the fc th -order PM alone 
(for k 1): there are k (modulo 2tt) different values of 0(z) that annihilate 
the argument of ^(ze - - 7 '^). 

As previously referred, the normalization angle 0(z) needs to satisfy prop- 



erty (21), i.e., the normalization angle of a rotated shape must be equal to 
the one of the original shape plus the rotation angle. We derive what this 
condition imposes to the solution for 0(z) that must be picked from the set in 



(29). By proceeding in a similar way as in (27)-(28), we express the argument 
of the fc th -order PM of a rotated shape as 

arg/i/^ze^) = arg/x^z) + k(f> + lid, (30) 



where I is an integer that guarantees that the argument of //^(ze- 7 ^) falls 
within the interval where this operator is defined, e.g., (— 7r,7r]. Using this 
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result and (29), we obtain the normalization angle of the rotated shape: 



We now see that the verification of the desired property (21) hinges on the 



choice of the integer I in the definition (29) of the normalization angle. In 



fact, if one decides to simply pick a given fixed /, i.e., to choose the same / for 



all shape vectors z, (31) becomes 9(ze^) = 9(z) + (p+ (27r/fc)Z, showing that 
property (21) is satisfied if and only if I = (mod k). However, this can not 
be guaranteed, since in general equality ( |30| ) requires distinct values of I for 
distinct just imagine (j) ranging from to 2tt and note that arg/x^ze- 7 ^) 
would exhibit jumps (in order to maintain its value within (— 7r,7r]) corre- 
sponding to a changing value of /, at values of spaced by intervals of length 
2n/k. 

6.2 Using a Pair of Moments 



The crux of our approach is to define the normalization angle in (29) by 
selecting a value for / that depends on the shape vector z. We will show 
that it is always possible to select /(z) according to our method, i.e., that 
it unambiguously defines a normalization procedure, and that the resulting 



normalization angle 9(z) satisfies property (21), thus guaranteeing maximal 
invariance to rotations in shape-based classification. 

To achieve this, we use a supplementary non-zero PM, /i m (z), with k and 
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m coprime, i.e., with gcd(fc, m) = 1, where gcd denotes the greatest common 
divisor. The case where there do not exist coprime non-zero moments will 
be treated in the next subsection. Note additionally that there are no shapes 
(with a finite number of points) with all but one non-zero PM^J Consequently, 
all shapes (except the one with all landmarks at the origin) have at least two 
non-zero moments, although not compulsorily coprime. 

Our choice for Z(z) is based on the arguments of the k th - and m th -order 
PMs. For simplicity, denote the argument of the m th -order PM of the nor- 
malized shape by i/(z, /), where / G {0, 1, . . . , k — 1} stands for the particular 
integer used in ([29]) to compute the normalization angle. Then, 



Kz,/) = arg/i m (ze-i 6 ^) 

N 



arg5>--m0(z,O (32) 



71=1 



arg/x m (z) - — arg/x fc (z) - — 2tt/ , (33) 



where we used the definitions of the PMs (26) in (32) and of the normalization 



angle (29) in (33), emphasizing its dependence on Z. It is easy to verify that, 
since k and m are coprime, the set { — (m/k)27rl : I — 0, 1, . . . , k — 1} is the 
same as the one expressed by {(l/fc)27rZ : I = 0, 1, . . . , k — 1} modulo 27T, 



2 A simple way to show this is to use the fact that the first N PMs fully specify a shape 



with N points (Kanatani, 1990). Consider all of them are but fik = 1, for an arbitrary 
choice of k. From this, compute the shape and from the shape compute the remaining 
(higher order) PMs (or compute these directly using Newton's identities). It will be clear 
from the resulting expression that they can not be all 0. 
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i.e., for each element of the first set, there is an element of the second one 
that differs by a multiple of 2tt and vice-versa. Thus, the set {^(z, I) : / = 
0, 1, . . . , fc — 1} contains k different elements (mod 2n) spaced by intervals 
of length 2ir/k. We propose to unambiguously choose Z(z) so that z/(z,Z(z)) 
falls within an arbitrary but fixed interval of length 2n/k: 



z/(z,Z(z)) e I (mod 2tt), (34) 

where the interval 

I = {\: A < A < A + 27i/k} , 



defined by an arbitrary but fixed A G K, is independent of z. 



The ambiguity in the definition of the normalization angle 9(z) in (29) is 



now solved through the choice of l(z) in (34), but we still have to check that 



this solution satisfies property (21), i.e., that the normalization angle of a 



rotated shape equals the one of the original shape plus the rotation angle. 
As derived above, the normalization angle of a rotated shape is given by 



fl(ze^) 



m _ arg/x fe (z) 



k 



2tt 
~k 



l(ze j *) + I 



(35) 



This last expression is a simple rewrite of (31), now emphasizing the depen- 
dance of I on the (rotated) shape; l(ze^) is thus the solution of (34) for the 



rotated shape ze^. To express the right side of (35) in terms of 0(z), we 
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must relate l(ze^) to Z(z). This is done by expressing the argument of the 
777 th PM of the normalized rotated shape in terms of the one of the original 
shape: 



i/(ze>'*Z(ze^)) 



777, 



arg/i m (ze J( ^) — — 27r/(ze- 



arg ii k (ze 



j4>\ 



ui 



arg/x m (z) + met) — -2nl(ze J(p ) 
k 



m 

T 



arg /ijfe(z) + fc0 + 27r/ 



777 777/ 

arg /i m (z) - — arg /j fc (z) - — 2tt ( Z(ze J0 ) + / 



i/ (z,Z(ze^) + f) 



(36) 

(37) 
(38) 
(39) 



where (36) uses (33), (37) uses (30), (38) are simple manipulations, and (39) 



uses (p3| again. From rt39| and our choice (34) applied to ze J( ^, we see that 



i/(ze^,Z(ze^)) = Kz,/(ze^) + G /. Due to the fact that J is fixed, i.e., 
that it does not depend on the shape z, and due to the uniqueness (mod k) of 



the solution in (34) with respect to /, /(ze J( ^) + I must be equal to l(z) (mod 



k). Replacing this concluding equality into (35) and using the definition of 



0(z) in (29), property (21) is immediately obtained. 

We call our method Principal Moment Analysis (PMA). 
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6.3 Dealing with Rotational Symmetry 

We now deal with the case when there are no coprime k and m such that 
[ik and ji m ^ 0, in which case it is impossible to apply the method just 
described directly. Let 7 be the greatest common divisor with respect to all 
the orders of non-zero PMs, i.e., 7 = gcd(X), if = {fc G Z + : ^ / 0}. 
When there are no k and m coprime such that /i^O and ji m ^ 0, we have 
7 > 1. This is equivalent to the existence of a 7-fold rotational symmetry, 
i.e., that the shape is invariant to rotations of 2^/7. A simple way to derive 
this equivalence is to use the fact that, as shown in Section [3j the PMs are 
the coefficients of the Fourier series of the restriction h(z,9) in ^ of the 
ANSIG to the unit circle: since the rotation of a shape propagates into its 



ANSIG ( Rodrigues et al[ 2008a), 7- fold rotationally symmetric shapes lead 
to /i(z,#) with period 271-/7, whose non-zero Fourier series coefficients will 
only occur at multiples of 7; conversely, if the Fourier series coefficients only 
occur at multiples of 7, /i(z, 9) has period 271-/7 and, as such, using the same 
propagation property, the shape is 7-fold rotationally symmetric. In this 
case, all normalization angles of the form 

= #0 + ^/7, keZ (40) 



lead to the same normalized shape. Hence, to compute a normalization angle, 
it suffices to compute the Fourier series coefficients of the function fo(z, 0/7) 
instead of the ones of /i(z, 9) (in the variable 0), then to use these coefficients 
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(i.e., the PMs) in the PMA as described in the previous subsection, obtaining 
an angle 0q, and, finally, to invert the expanding effect of h through the 



contraction of 6' 0: i.e., assign 9 = #0/7. Any 9 in (40) can then be used. 
In terms of the PMs, it is easy to see through the properties of the Fourier 



series (Oppenheim et al, 1996) that this procedure is simply equivalent to 
using the PMs of orders jk and 7m instead of the original ones of orders 
k and m, respectively, and then contracting the resulting angle. Finally, a 
last equivalent method is to compute the PMs ^ or (26) directly from the 
"powered" shape vector [z^, z\ , . . . , z]^] T (equal to the spectrum of /i(z, 0/7) 
up to a real positive scaling factor), apply the PMA and contract the result. 



6.4 Improving Robustness 

Until now, we presented a theoretical proof for the correctness of PMA to 
unambiguously compute the orientation of arbitrary shapes using a pair of 
moments. Since in practice it is also important to obtain robustness to noise, 
we now describe how to improve the robustness of PMA by using a larger set 
of PMs. In fact, PMA can be used with any pair of coprime indices (fc,m), 
provided that /x*. 7^ and /x m 7^ 0. In order to improve robustness, we 
integrate the contributions of several pairs (fci,rai), (/c2,m 2 ), . . (^m^m), 
by computing pairwise estimates #i(z), i — 1, 2, . . . , M, and defining a robust 
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normalization angle 0(z) as the (angular) weighted average of them: 



M 



0(z) = arg Wj e j e% 



(41) 



2=1 



The reason for the angular average is its ability to deal with angles close to 
the region of circular discontinuity. For instance, we want the average of 1° 
and 359° to be arg(exp(j 1°) + exp(j 359°)) = 0°, not (1° + 359°)/2 = 180°. 



The proof that property (21) holds for the robust normalization angle defined 



in (41) is straightforward: 



M 



9(ze j( ^) = arg ^ wi e 



M 



arg 



M 



= arg^^ e je ^ z) + 
= 0{z) + <j>, 



(42) 



(43) 



where (42) uses the fact that the individual 0$(z) satisfy (21) and (43) uses 



definition (41). 



6.5 Implementation Details 



In practice, PMA was implemented the following way: we start by computing 



the first 20 power sums in (26) and normalizing their magnitude by dividing 
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jik by its absolute value raised to k ) i.e., by using fa = ^k/\l^k\ k instead of 
Hk- This normalization aims to cancel the growth of the magnitude of the 
power sums with k. Note that alternatives to this normalization include the 
one discussed in Appendix [A] or the usage of the PMs as defined @). Then, 
we detect non-zero PMs by thresholding their magnitude, i.e., we only use 
if 1/4 1 is above a threshold, say, 10 -3 . In what respects to experimental tun- 
ing, our method only requires dealing with the two parameters just referred. 
Finally, we compute the greatest common divisor of all the indices corre- 
sponding to non-zero PMs and run the algorithm described in the previous 
section. 

In the second step of the algorithm (the core of the method) , for each non- 
zero moment /i/e, we find the smallest index m that is coprime with k (the 
first step described above guarantees that this index always exists). Then, 
we search for /(z) G {0, 1, 2, . . . , k — 1} until we find ^(z, /(z)) mod 2tt G / 
(we choose I = (— n/k, ir/k]). The value found for /(z) is used to compute 
the pairwise estimates 6^(z), which are then averaged using weights given by 
Wi = \fiki Am J- The rationale for the choice of these weights is that PMs with 
larger magnitude have an argument less sensitive to noise. Note that the 
PMs used in the weights are the normalized ones, being this normalization 
thus also important to avoid over- weighting angles 9i(z) computed from PMs 
of large order. 
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7 Extension to Grey-level Images 

The algorithm presented in the previous section computes a unique orienta- 
tion 0(z) for an arbitrary set of landmarks {z\ ) z 2) . . . , Zjv}, satisfying prop- 
erty ( [21] ). We now generalize the concept to compute a unique orientation 
of a continuous image g(x,y). For that purpose, generalize the moments of 
( |26| ) to the equivalent ones of |Abu-Mostafa and Psaltis] (1985), i.e., make 



MfcG?) = Cko(g). From ([i]), the PMs of the grey-level image g{x,y) are thus 

p p+oo 

Pk{g) = / {x + jyf g(x,y)dxdy , (44) 

«/ J —oo 



with e {0, 1,2,.. .}. In what respects to representation, the generalization 
to continuous images loses completeness: in opposition to the case of a set of 
points, discussed in Sections [2] and |3j the PMs in (44) do not determine the 



image g{x ) y) uni vocally. An immediate way to conclude that is focusing on 



radial images. Start by rewriting (|44|) in polar coordinates: 



/7T POO 
/ r k e jk0 g(rcos9,rsm9)rdrd9. (45) 
-TT A) 



Now, for the radial image g(r cos#, r sin9) = R(r), from (45), we easily get 

l^kig) = 0, for k e {1, 2,3,.. .}, and fio(g) = 27T J °° R(r) r dr. As this integral 

does not define the function R(r) univocally [^J the PMs do not determine 

3 For example, the functions R^r) = H "(r) - H (r - y/2/S) and R 2 (r) = r(H(r)-H(r- 
1)), where H(-) denotes the Heaviside step function, lead to the same value for the moment 
/i = 27r/3. 
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Naturally, the lack of completeness just referred does not impede the 
extension of PMA, our rotational normalization algorithm presented in Sec- 
tion |6| for continuous images. A simple way to derive this extension is by 
using the derivations in Sections [4] and [6] with new definitions of the ob- 
jects at hand, i.e., with moments and rotations of images instead of shapes. 
The symbol z is now interpreted as an image g{x ) y) and ze^ is inter- 
preted as the image that results from the (counterclockwise) rotation of z 
by the angle (j). With these definitions, its trivial to show that it still holds 
[ze^]e^ = ze j( ^ + ^ = [ze^]e^ (because image rotation is associative and 
commutative) , thus the derivations in Section [4] remain valid for the inter- 
pretation in terms of continuous images. 

As far as the derivation of the PMA algorithm in Section [6] is concerned, 
the reader should note that it is entirely based on the property /x&(ze J ^) = 
/ifc(z)e j7c< ^. We have to show that this property extends to the interpretation 
in terms of grey-level images, i.e., that the fc th -order PM of an image rotated 
by (j) equals to the product of the fc th -order PM of the original image by 
e jk(f) (notice that the multiplication of by is an ordinary one, not a 
rotation, since is a complex number, not an image). Using the definition 
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of PMs in (45), the desired property is immediate: 



r k e jk0 g(rcos(e-(t)), rsin(0-») r dr d9 

.. „ 



/7T /»C 
7T «/0 



r k e Jk[ ^ } g(r cos 0, r sin 0) r dr d# 



/7T /»00 
/ r k e jk0 g(r cos 0, r sin 0) r rfr d0 

Having shown how to extend PMA to grey-level images, we end this 
section by emphasizing that some care must be taken with the claims about 
the evidence of rotational symmetry and the universality of the algorithm. 
When dealing with sets of landmarks, as derived in Section |6j the moments /!& 
are nonzero only for indices k multiples of 7 > 1 if and only if the shape is 7- 
fold rotationally symmetric. However, this equivalence is not fulfilled for the 
case of continuous grey-level images. To get insight on what happens in this 



case, start by interpreting the moments in (45) in terms of the Fourier series 
of periodic signals obtained by circularly slicing the image. In particular, the 
fc th -order PM of the polar coordinate image /(r, 0) = f g(r cos0,rsin0) can 
be written as 

POO 

fJL k = r k+1 F(r,k)dr , (46) 
Jo 

where 

f(r,6)e jke d9 (47) 

-7T 
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are the coefficients of the Fourier series of the 27r-periodic (in 6) signal /(r, 6). 
Naturally, an image /(r, 9) is 7- fold rotationally symmetric if and only if it is 
27r/7-periodic in 0, for all r > 0. Hence, the coefficients F(r, k) of its Fourier 
series are nonzero only for k multiple of 7 if and only if there is 7-fold 



symmetry in f{r,6) (see, e.g., Oppenheim et al, 1996). In turn, the nonzero 



moments ji^ in (46) only occur for k multiple of 7 in this case and PMA 



can be used as described in Subsection |6.3[ However, this last statement 
is not an equivalence, since nonzero coefficients F(r, k) may be destroyed 



by the integration in (46). In fact, as we detail in Appendix |Bj there exist 
very particular grey-level images that are not rotationally symmetric but 
have moments /i^ that are nonzero only for indices k multiples of 7 > 1. 
There even exist images with a single non-zero moment (other than the radial 
ones referred in the first paragraph of this section; these last ones are not 
normalizable in what respects to orientation). Naturally, as any other method 
based on these moments, PMA fails to process these images. 



8 Experiments 

We now describe experiments. The following subsections focus on illustrating 
the compactness of the PM-based shape representation (Subsection |8.1 ), its 



usage in classification (Subsection |8.2[ ), and the results of PMA for rotational 
normalization of shapes (Subsection |8.3[ ) and grey-level images (Subsection 
8i4h. 
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8.1 PMs for Shape Representation 



In this subsection, we focus on showing that the computational saving that 
arises from using our PMs does not degrade performance when compared 
with the discriminative ANSIG, the densely sampled signature introduced 



by Rodrigues et al (2008a). We illustrate this point with the shape shown 



in Fig. |8j which is described by 7 landmarks. The plot in Fig. [9] shows 
the magnitude of the PMs {M^} of this shape. Proceeding as described in 
Section |3j we obtain the required number of PMs for this shape, ks = 6. 
As easily perceived from Fig. [9J the mag nitude of the 6 th PM is very small, 
indicating that the first 6, {M/^O < k < 5}, containing the majority of the 
energy, adequately describe the shape. Since the shape was pre-processed as 
in ([3]) , we obtain Mi = and M = 1 (this last PM is not represented in the 
plot). 



-0.15 -0.1 -0.05 0.05 0.1 0.15 



Figure 8: A 2D shape described by a small number of landmarks. 



In Fig. 10, we represent, with solid lines, the magnitude and phase of 
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Figure 9: Magnitude of the PMs of the shape in Fig. [8] (or of the coefficients 
of the Fourier series of its analytic signature) . 

the densely sampled ANSIG of the shape in Fig. [5J As shown in Section |3j 
the coefficients {H^} of the Fourier series of this periodic complex signal are 
given by the PMs of the shape. To verify this in practice, we computed the 
Fast Fourier Transform (FFT) of the vector collecting the dense sampling of 
one period of the ANSIG, since it is straightforward to derive that this FFT 



is equal to {H^} multiplied by the number of samples (Oppenheim et al 
|1996[ [1999). As expected, we concluded that the Fourier series coefficients 
{Hk} coincide with the PMs {M^}, whose magnitude is represented in Fig.|9j 
In the plots of Fig. [lOj we also compare the densely sampled ANSIG 
with a signature obtained by interpolating our very compact representation. 
As derived in Section [3| the required number ks = 6 of coefficients needed 
to represent the shape can be interpreted either as the minimum number 



of PMs or the minimum number of samples of the ANSIG. In Fig. 10, we 
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represent these ks = 6 samples with stars, showing how much sparser this 
representation is when compared with the densely sampled ANSIG. Finally, 
we superimpose, represented by dashed lines, the reconstruction obtained by 
interpolating this sparse set as described in Section [3j We see that the lines 
of both plots are visually indistinguishable, showing that the PMs are ade- 
quate to represent the ANSIG and, consequently, the underlying shape. To 
illustrate what happens when using less samples than the minimum required 
by our study, we repeat the procedure by interpolating from 4 samples, ob- 
taining Fig. [TT] As easily seen, the reconstructed signature differs from the 
dense sampled one. Note, nevertheless, that there is no guarantee that our 
bound ks is tight. 



8.2 PMs for Shape Classification 

In our experiments, the behavior illustrated in the previous subsection was 
observed in general, i.e., a set of ks PMs always suffices to accurately describe 
the densely sampled ANSIG. Although this is enough to guarantee that the 
same results are obtained when classifying shapes described by either their 
small sets of PMs or their dense ANSIGs, we also verified this directly. In 
particular, we generated noisy versions of prototype shapes and classified 
them by using 1-NN, i.e., by just selecting the prototype that had most 
similar description. The number of PMs ks ranged from 15 to 22, thus 
our descriptions are much shorter than the vectors of 512 ANSIG samples 



used in Rodrigues et al (2008a). We performed hundreds of tests for each 
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shape, obtaining the same performance (100% correct classifications, except 
for shapes that are visually indistinguishable), for both the densely sampled 
ANSIG and the PMs. Since the ANSIG was extensively demonstrated in 
shape-based classification of real images (Rodrigues et al, 2008a|b| ) and we 



have shown that the PMs have similar behavior, we do not report here other 
experiments on PM-based shape classification. 

8.3 PMA for Shape Normalization 

Although in the paper we have theoretically proven the correctness of PMA 
for shape normalization, i.e., that it succeeds in unambiguously computing a 
unique orientation for any shape, the impact of PMA in shape-based recog- 
nition applications is also determined by the sensitivity to the noise, since 
observations of similar shapes must originate similar normalization angles. In 
this subsection, we illustrate that PMA is able to deal with these situations. 

We start by illustrating that our method disambiguates the direction of 
the principal axis of generic shapes, i.e., shapes without rotational symmetry. 
We used noisy versions of the shape in Fig. [TJ see examples on the left column 



of Fig. 12, and computed the correspondent normalization angles using PMA. 



The right column of Fig. 12 shows the resultant shapes, i.e., the rotationally 
normalized versions of the corresponding shapes on the left. We see that, 
regardless of the noise, all the normalized shapes exhibit similar orientation. 



Although the examples in Fig. 12 illustrate the disambiguation of di- 



rection, the accuracy of the estimates of the normalization angle is better 
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evaluated by contrasting them with the ground truth. We thus performed 
experiments by rotating noisy shapes according to a known angle, ranging 



from — 7r to 7r, and then estimating the orientation. The plot in Fig. 13 



sum- 



marizes the results. We see that the estimates obtained by using PMA are 
very close to the correspondent true orientation, for all values of the rotation 
angle. For illustration purposes, the plot also shows the results obtained by 
using PC A, which naturally exhibit the directional ambiguity discussed in 
Section [T] and illustrated in Fig. [TJ 

In a similar way, we now illustrate that PMA also deals with noisy ob- 
servations of rotationally symmetric shapes. We used noisy versions of a 
three-fold rotationally symmetric shape that would be impossible to orient 



using PCA, see the left column of Fig. [14j and normalized their orientations 
using PMA, obtaining the visually correct results on the right column of 



Fig. 14, By proceeding in a similar way as described above, we contrasted 



the estimates with their ground truth, obtaining the plot in Fig [15} Note 
that, in this case, there are three values for the true rotation angle, corre- 
sponding to three consistent orientations, due to the three-fold symmetry 



of the shape, see expression (40). Since the shape is rotationally symmetric, 
PCA is useless for the determination of an orientation, providing results only 
determined by the noise. In opposition, the plot in Fig [15] shows that the 
estimates obtained through PMA are robust to noise. 

To illustrate the robustness of PMA to the shape sampling density, we 
used Japanese characters. We synthesized corrupted versions of those char- 



46 



acters by removing up to 95% of the shape points, obtaining shape vectors 
of the form of expression but of very distinct cardinality. We then pro- 



cessed these vectors by using PMA. In Fig. [16j we single out five instances of 
a specific character to illustrate the consistent orientations obtained for all 
the corrupted characters. 

8.4 PMA for Normalization of Grey-level Images 

Finally, we illustrate the usage of PMA to rotationally normalize grey-level 
images. We used real images, consisting of photos of trademark logos, see 



examples in the left columns of Figs. 17 19} By proceeding as described in 



Section [7| we used PMA to normalize the orientation of the photographed 
logos. In spite of geometric distortions (e.g., perspective, radial) and other 
intensity disturbances in the images, we got consistent results, as illustrated 



by the oriented versions of the logos in the right columns of Figs. 17 19 Note 
that these examples correspond to particularly challenging grey-level images, 
due to approximate rotational symmetry. 

Since we detailed several situations where current methods fail, see Sec- 
tion [5| we do not report here experimental results obtained with those al- 
gorithms. In fact, it would be easy to produce examples where estimates 
obtained through PMA would be much more accurate than those obtained 
by using other methods (just imagine using shapes similar to the ones in 
Figs. [5| [6| or However, we found it would be more informative to present 
the discussion in Section [5] regarding the core limitations of those methods, 
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i.e., to show how they attempt to use information that is not available in all 
shapes, than to blindly report sample experiments to support our approach. 

9 Conclusion 

We proposed to represent 2D shapes, i.e., sets of unlabeled points or land- 
marks, via particular complex moments that we call Principal Moments 
(PMs). This representation is complete and we show it is compact, in the 
sense that the number of PMs needed to discriminate between shapes is small 
(and dependent on their complexity). We further presented a new method, 
Principal Moments Analysis (PMA) to unambiguously compute a unique 
orientation for arbitrary 2D shapes. This enables performing rotational nor- 
malization, thus obtaining maximally invariant (i.e., complete) representa- 
tions for 2D shapes. We finalized by extending PMA to the normalization 
of grey-level images. Besides theoretically sound, PMA results are robust to 
noise. 
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A On the Normalization of Power Sums 



In this appendix, we derive an expression for the expected growth of the 
magnitude of the power sum 

N 

fi k = J24, ke {1,2,3,...}, (48) 

71=1 

under reasonable assumptions for the set of 2D points {z n }. 

Let {z n) n — 1, 2, . . . , TV}, be samples of a complex random variable (r.v.). 
Due to the common pre-processing of centering the shape ([3]), we assume 
this r.v. is zero mean, i.e., E{z n } = 0,V n . We also assume the likelihood of 
each direction is the same, i.e., the angle arg(z n ) is uniform in [0, 27r) and 
independent of the absolute value \z n \. Under these assumptions, we obtain 

E{z k ] = E{\z\ k e jkarg(Zn) } 

= E{\z\ k }E{e? kaxe{Zn) } (49) 
= 0, (50) 



where (49) is due to the independence between \z n \ and arg(z n ) and (50) is 
due to the uniformity of arg(z n ). Using this result, we conclude that the 
mean value of the power sums is zero: 



N } N 



E{M = E =E E K}=0- (51) 



. n=l J n=l 
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The issue we address in the sequel is the expected grow of |//&|- 

We start by expressing E{|/i&| 2 } in terms of a moment of the real r.v. \z\ ) 
through the chain of equalities 

E{|H 2 } = E{^4} 

!N N 
n=l m=l 



n,m=l 
AT 

= 5>{(^;) fc } (52) 



n=l 

TV 



n=l 

= jVE{|^| 2fe }, (53) 



where * denotes the complex conjugate and (52) is due to the independence 
between z n and z m for n ^ m. Expression (53) states that E{|/i/e| 2 } is pro- 
portional to the moment of order 2k of the real r.v. \z\ (this type of moments 
is often referred as raw moments, to emphasize that the corresponding r.v. is 
not zero mean, as it is obviously the case of \z\). 

Depending on the probability density function (p.d.f.) of the r.v. z, we 
obtain different growing rates for For example, if the p.d.f. of z, besides 
being circularly symmetric on the complex plane, is Gaussian, i.e., if z is 
a 2D Gaussian r.v. with co-variance proportional to the identity matrix, its 
absolute \z\ is a Rayleigh r.v. (see, e.g., Papoulis, 1991). The p th -order raw 
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moment of a Rayleigh r.v. is given by 



M p = E{\z n \ p } = a p 2P/ 2 T(l+p/2) 



(54) 



where 



T(x) = I t x - L e- l dt 



is the Gamma function, which, for an integer argument, is given by 



r(g) = (g-l)! 



(55) 



(see Papoulis, 1991). From (53), (54), and (55), we finally get 



E{|^| 2 } = NM 2k 

= Na 2k 2 k T(l + k) 
= N(2a 2 ) k k\. 



(56) 



Thus, for shapes respecting our assumptions, the moment jik should be nor- 



malized according to the square root of (56), which, using Stirling's approxi- 



mation (Paris and Kaminsky, 2001), can be shown to be ^/E{|/x/e| 2 } = o(k\). 
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B Grey-level Images That Make PMA Fail 

In this appendix we discuss the conditions under which grey-level images have 
only a few (or even a single) nonzero moments, preventing PMA to work as 
desired. As anticipated in Section [7| the limitations come from images that 
are not rotationally symmetric but "appear to be", in the sense that their 
nonzero moments occur only for indices k multiples of a given 7 > 1. 
These images /(r, 9) have nonzero Fourier series coefficients F(r, fc), given 



by (47), that are canceled out in by the integral in (46). 



Consider, as an example, the image 



/(r, 9) = -R(r) (cos 9 + cos 29) . (57) 

7T 



It is clear that the image f(r,9) is not rotationally symmetric, since the 
fundamental period of cos 9 + cos 29 is 2ir. The Fourier series coefficients 



F(r, k) in (47), for non-negative fc, i.e., the coefficients that determine the 



PMs in (46), are given by 

F(r, k) = R(r) (S(k - 1) + S(k - 2)) , (58) 

where S(-) denotes the Dirac delta function. This expression is easily obtained 
from the Fourier series synthesis formula /(r, 9) = l/2n Ylk=-oo ^( r > ^)e~i ke 



(see, e.g., Oppenheim et al, 1996). From (|46|) and (|58|), we obtain the PMs 
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of the image /(r, 9): 



oo poo 



'0 Jo 



It is now clear that we can specify a function R(r) such that only one PM is 
nonzero. For example, with 



R(r) = H(r) - 2H(r - 1) + H(r - yft) , 
where H(-) denotes the Heaviside step function, we obtain: 



(59) 





= 


/ii 


= 




1 








= 


/i 4 


= 




= 




= 0. 



1-^2 



(60) 



This shows that the PMA algorithm fails to process the image specified by 



(57) and (59), since, from (60), besides wrongly assuming the image is 2-fold 



rotationally symmetric, PMA would fruitlessly search for a pair of nonzero 



PMs. This grey-level image f(r,0) is shown in Fig. 20 
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We distinguish two cases where PMA applied to grey-level images fails, 
caused by the failure of two different parts of the algorithm. Namely, PMA 
fails when the image /(r, 6) 

• has a single nonzero PM, caused by 



nonzero Fourier series coefficients F(r,k) in (47) for more than 



one value of k but nonzero moments in (46) for a single value 
of k] 



— nonzero F(r,k) in (47) for a single k > 1 (the case k = cor- 
responds to a radial image, not normalizable in what respects to 
orientation) ; 

• has zero PMs for all orders k not multiple of a given 7 > 1 without 
being 7- fold rotationally symmetric, caused by 



— at least one Fourier series coefficient F(r, k) in (47) is nonzero for 
k not multiple of 7 but the nonzero moments /i& in (46) occur only 
for k multiple of 7. 

In the case of a single nonzero PM, the algorithm fails to find co-prime pairs, 
whereas in the case of nonzero PMs only for k multiple of a given 7 > 1, 
the failure lies on the incorrect detection of a 7-fold symmetry. Grey-level 
images of these classes appear to be somewhat particular, as the example 



in Fig. 20 illustrates, thus we did not face any failure when processing real 



images of trademark logos, see examples in Figs. [17}fl9 
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Figure 10: Magnitude and phase of the analytic signature of the shape in 
Fig. [8| The very sparse representation we propose leads to plots that re- 
sult visually indistinguishable from those obtained by dense sampling (after 
reconstruction). 
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Figure 11: Same as in Fig. [TOJ now with a number of samples smaller than 
the one required by our study. Note how the interpolated signature now 
differs from the dense sampled one. 
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Figure 12: Examples of using PMA for computing the orientation of general 
shapes. Left: original shapes; right: PMA oriented shapes. Relative to the 
ambiguity illustrated in Fig. [TJ note that PMA disambiguates the direction 
of the principal axis. 
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;ure 13: Comparison of PMA and PCA for general shapes. 
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Figure 14: Examples of using PMA for computing the orientation of rotation- 
ally symmetric shapes. Left: original shapes; right: PMA oriented shapes. 
In spite of the absence of a principal axis, and the high level of noise, PMA 
provides consistent orientations. 
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Figure 15: Comparison of PMA and PCA for rotationally symmetric shapes. 
In this case, with a three-fold symmetric shape, the ground true is repre- 
sented by three lines corresponding to the three consistent orientations, i.e., 
separated by 2n/3. 



64 



3* 4* 



Figure 16: Using PMA to normalize corrupted Japanese characters. 
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Figure 20: Example of a grey- level image for which PMA fails. 
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